Do We Fully Understand the Symmetric Lanczos Algorithm Yet? *

Beresford  N.  Parlett  * 


Abstract 

Imagine that one has computed the real $n$-vectors $b, Ab, A^2 b, \ldots, A^{n-1} b$ where $A$ is a
real symmetric $n \times n$ matrix. Lanczos showed us in 1950 how to construct a much better
basis for the (Krylov) space spanned by these power vectors and for little extra cost.
The new basis $\{q_1, q_2, \ldots, q_m\}$, now called the Lanczos basis, has two nice properties:
(i) it is orthonormal, (ii) the representation of $A$'s projection is a symmetric tridiagonal
matrix $T_m$. Property (ii) is synonymous with the three term recurrence governing the
Lanczos vectors. Moreover some of $T_m$'s eigenvalues, called Ritz values hereafter, are
excellent approximations to some of $A$'s eigenvalues even when $m \ll n$. In addition
we can tell, with little expense, which Ritz values are also eigenvalues. One surprising
implication of these properties is that it is easier to find the largest few eigenvalues of
$A$ than to solve $Ax = b$!

When the Lanczos algorithm is implemented on a computer the user discovers
an unpleasant fact. Property (i) fails completely for $m$ as small as 20 or 30 and
consequently the computed $T_m$'s relation to $A$ is unclear. Lanczos was aware of this
blemish and proposed the obvious remedy: keep applying the Gram-Schmidt process to
each new Lanczos vector as it is computed. The catch here is that all the $\{q_i\}$ must be
kept handy whereas in exact arithmetic only the three latest Lanczos vectors are needed
and earlier $q$'s may be discarded. The arithmetic cost of this full reorthogonalization
grows quadratically with $m$. So the hope of computing $T_n$ efficiently and accurately by
the Lanczos algorithm was dashed and other methods prevailed. In exact arithmetic
$T_n$ is similar to $A$ and the algorithm stops.

What  Lanczos  and  Wilkinson  both  failed  to  see  is  that  there  is  structure  in  the  way 
that  orthogonality  is  lost.  This  structure  is  revealed  by  a  clever  change  of  basis  and  it 
was  discovered  by  C.  C.  Paige  in  1969/1970  while  writing  his  dissertation.  Moreover 
the  computed  Tm  retains  information  about  A.  Thus  the  loss  of  orthogonality  delays 
the  discovery  of  A’s  eigenvalues  by  the  simple  Lanczos  algorithm  but  does  not  prevent 
the  attainment  of  full  accuracy  if  enough  steps  are  taken.  In  finite  precision  arithmetic 
the simple Lanczos algorithm will run forever and we are just beginning to come up
with good models that describe how $T_m$ relates to $A$ when $m \gg n$.

1  Introduction 

Today  the  Lanczos  algorithm  seems  so  natural,  so  inevitable,  and  so  simple  that  it  is 
difficult  to  imagine  that  it  was  not  part  of  the  numerical  scene  until  1950.  Of  course  digital 
computers  did  not  appear  until  the  end  of  World  War  II  but,  equally  important,  is  the  fact 
that  the  concept  of  a  tridiagonal  matrix  was  not  in  the  mental  tool  box  of  a  scientist  in 
1950.  A  tridiagonal  matrix  does  not  reveal  its  eigenvalues  or  eigenvectors  to  the  casual  eye 
so, at best, it seemed a mere stepping stone to the desired spectral information. Indeed it
was not until 1954 that safe and efficient ways of computing the spectrum of a symmetric
tridiagonal matrix were discovered. For some applications, such as solving large systems
of differential equations, a tridiagonal representation may suffice and diagonalization is
overkill, see [13], [8].

This  essay  will  discuss  the  Lanczos  algorithm  in  the  context  of  exact  arithmetic  and  in 
the  context  of  computer  arithmetic.  A  sequence  of  pictures  is  presented  that  reveals,  in 
striking  detail,  the  structure  governing  the  finite  precision  algorithm. 

The  essay  ends  with  a  brief  discussion  of  recent  work  designed  to  bound  the  number  of 
steps  needed  to  determine  a  given  ‘well  represented’  eigenvalue  to  working  accuracy.  Indeed 
it  is  not  at  all  obvious  that  the  tridiagonal  matrix  produced  by  the  Lanczos  algorithm  will 
eventually  have  a  spectrum  that  comes  arbitrarily  close  to  each  eigenvalue  of  the  given 
matrix,  even  if  the  algorithm  is  run  forever. 

Let us now introduce the notation that is needed to tell our story. We start with an
$n \times n$ Hermitian matrix $A$ produced by a physicist or engineer. Either all or part of $A$'s
spectrum is wanted and the challenge is that $n$ may be large; $n = 10^4$ is common, $n = 10^6$
no longer makes headlines, and $n = 10^8$ is waiting around the corner. In several ways it
helps to think of $A$ as a self-adjoint linear operator and to forget $n$.

Physicists are in the habit of talking glibly about diagonalizing a Hermitian matrix
because their instructors tell them that this is always possible. The diagonal matrix is
called $\Lambda$ and we write

(1) $A = Z \Lambda Z^*$

where $Z$ is a unitary matrix whose columns are eigenvectors of $A$. For any complex object
$C$ we write $C^*$ for its conjugate transpose. In this essay we are concerned only with the
first step in the task of computing $\Lambda$, namely the production of a real tridiagonal matrix $T$
where 
so that

(3) $A = Q T Q^*$

with $Q^* = Q^{-1}$, so that $Q$ is unitary.

Is $T$ canonical in the same way as $\Lambda$? Certainly not. There are infinitely many essentially
different $T$ matrices, a family with $n - 1$ degrees of freedom. This freedom can be specified
very nicely.

Theorem 1.1. If Equation (3) above holds and $Q$ is written by columns as
$Q = [q_1, \ldots, q_n]$ then both $T$ and $Q$ are completely determined by $A$ and $q_1$ (or by $A$ and $q_n$).

The proof of this theorem yields the Lanczos algorithm! The proof supposes that all
arithmetic  operations  are  performed  exactly  and  numbers  represented  exactly.  Before 
presenting  the  proof  we  hasten  to  add  that  this  is  not  the  way  Lanczos  himself  introduced 
his  algorithm.  For  that  story  see  the  commentaries  on  the  1950  paper. 

Proof. To prove the theorem it is only necessary to equate columns on each side of the
equation

$AQ = QT.$

Column 1 gives $A q_1 = q_1 \alpha_1 + q_2 \beta_2$. Orthogonality of $Q$'s columns yields
$q_1^* A q_1 = q_1^* q_1 \alpha_1$ and the fact that each column of $Q$ has Euclidean length 1 gives
$\alpha_1 = q_1^* A q_1$ and $\beta_2 = \|A q_1 - q_1 \alpha_1\|$. Throughout this essay $\|v\| := \sqrt{v^* v}$.
Thus $q_2 = (A q_1 - q_1 \alpha_1)/\beta_2$.
In the same way, on equating column $k$ on each side of $AQ = QT$ one finds that

$\alpha_k = q_k^* A q_k = q_k^* (A q_k - q_{k-1} \beta_k),$

$\beta_{k+1} = \|A q_k - q_{k-1} \beta_k - q_k \alpha_k\|,$

$q_{k+1} = (A q_k - q_{k-1} \beta_k - q_k \alpha_k)/\beta_{k+1}.$

So if $q_{k-1}$, $\beta_k$, and $q_k$ are known then $\alpha_k$, $\beta_{k+1}$, and $q_{k+1}$ are completely determined.
Notice that we rather cleverly assumed that all $\beta$ values were positive. In practice, if
$\beta_{k+1} = 0$, we are delighted because then the linear span of $\{q_1, q_2, \ldots, q_k\}$ is invariant
under $A$ and every eigenvalue of $T_k$ is an eigenvalue of $A$.
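
The recurrence obtained in the proof can be written down almost verbatim. What follows is a minimal NumPy sketch, not Lanczos's own formulation: it assumes a real symmetric $A$ stored as a dense array, and the names `lanczos` and `tridiag` are illustrative choices that do not appear in the essay. No reorthogonalization is performed, so it matches the exact arithmetic description only for short runs.

```python
import numpy as np

def lanczos(A, q1, m):
    """Simple Lanczos recurrence for a real symmetric A (assumed dense ndarray).

    Returns alpha (diagonal of T), beta (beta[k] is the essay's beta_{k+2})
    and Q, whose columns are the Lanczos vectors. Stops early if a beta is 0.
    """
    n = q1.size
    Q = np.zeros((n, m + 1))
    alpha = np.zeros(m)
    beta = np.zeros(m)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for k in range(m):
        r = A @ Q[:, k]
        if k > 0:
            r = r - beta[k - 1] * Q[:, k - 1]
        alpha[k] = Q[:, k] @ r            # alpha_k = q_k^*(A q_k - q_{k-1} beta_k)
        r = r - alpha[k] * Q[:, k]
        beta[k] = np.linalg.norm(r)
        if beta[k] == 0.0:                # invariant subspace: nothing more to compute
            return alpha[:k + 1], beta[:k], Q[:, :k + 1]
        Q[:, k + 1] = r / beta[k]
    return alpha, beta, Q

def tridiag(alpha, beta):
    """Assemble the j x j symmetric tridiagonal T_j from the recurrence coefficients."""
    j = alpha.size
    off = beta[:j - 1]
    return np.diag(alpha) + np.diag(off, 1) + np.diag(off, -1)

# Hypothetical usage on a small random symmetric matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 8))
A = (B + B.T) / 2
alpha, beta, Q = lanczos(A, rng.standard_normal(8), 5)
T5 = tridiag(alpha, beta)
```

Only the two most recent columns of `Q` are read inside the loop; the sketch keeps all of them merely so that the result can be inspected afterwards.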

Could  anything  be  more  simple? 

Let  us  mention  another  attractive  feature  of  the  algorithm.  We  live  in  an  age  that 
measures  standard  of  living  by  the  amount  that  the  average  citizen  discards.  We  love  to 
throw things away. Notice that at the end of step $k$ we may throw away $q_{k-1}$; it has served
its purpose. In order to compute eigenvalues it is only necessary to store $T$ and to use
3 $n$-vectors in the fast memory. The arithmetic effort at each step is dominated by the
formation of $A q_k$. Indeed, if $A$ is a sparse matrix and the cost of forming $A q_k$ is $O(n)$
rather than $O(n^2)$ then we have an algorithm that computes $T_n$ with $O(n^2)$ effort.

It is too good to be true, as we shall soon see.


2  Rayleigh-Ritz  Approximations 

There  is  more  to  be  said  about  the  Lanczos  algorithm  in  exact  arithmetic.  Two  matrix 
equations specify the output of the algorithm at each step. Define

$Q_j := [q_1, \ldots, q_j],$

$e_j := (0, 0, \ldots, 0, 1)$, a row $j$-vector.

Then

(4) $Q_j^* Q_j = I_j,$

(5) $A Q_j - Q_j T_j = q_{j+1} \beta_{j+1} e_j.$

Here  is  a  picture  of  (5) 

Up to this point we have neglected an important property of the algorithm. After an initial
phase  is  over,  i.e.  for  large  enough  j,  some  eigenvalues  of  Tj  are  good  approximations  to 
some  eigenvalues  of  A.  Which  are  the  good  approximations  and  how  good  are  they? 

There  is  a  beautiful  answer  to  these  questions. 

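The extra notation needed at this point is the spectral factorization, or diagonalization, of $T_j$:

(6) $T_j S_j = S_j \Theta_j$

where $S_j^{-1} = S_j^*$ is a real orthogonal $j \times j$ matrix, and $s_i^{(j)}(k)$ denotes the $k$th entry
in the normalized eigenvector $s_i^{(j)}$ belonging to $\theta_i^{(j)}$. Also

$\Theta_j = \mathrm{diag}(\theta_1^{(j)}, \ldots, \theta_j^{(j)}),$

where

$\theta_1^{(j)} < \theta_2^{(j)} < \cdots < \theta_j^{(j)}$

are the eigenvalues of $T_j$ which, henceforth, will be called Ritz values (of $A$ at step $j$).
Associated with $\theta_i^{(j)}$ is its Ritz vector

(7) $y_i^{(j)} = Q_j s_i^{(j)} = \sum_{k=1}^{j} q_k \, s_i^{(j)}(k), \quad i = 1, \ldots, j.$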


The point of all this is that the pairs $(\theta_i^{(j)}, y_i^{(j)})$, $i = 1, \ldots, j$, are the Rayleigh-Ritz
approximations to $A$'s eigenpairs from the range of $Q_j$. It is known that, in several ways,
these are collectively the best set of approximations from

range $Q_j = \mathrm{span}(q_1, A q_1, A^2 q_1, \ldots, A^{j-1} q_1)$.

For more details on this subject see [1] and [11].

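To distinguish the good Ritz values from the bad ones it is only necessary to
postmultiply (5), the matrix form of the 3 term recurrence, by $s_i^{(j)}$ to obtain

(8) $A Q_j s_i^{(j)} - Q_j T_j s_i^{(j)} = q_{j+1} \beta_{j+1} s_i^{(j)}(j), \quad i = 1, \ldots, j.$

By (6), $T_j s_i^{(j)} = s_i^{(j)} \theta_i^{(j)}$, and by (7)

(9) $A y_i^{(j)} - y_i^{(j)} \theta_i^{(j)} = q_{j+1} \beta_{j+1} s_i^{(j)}(j)$

and, on taking norms,

(10) $\|A y_i^{(j)} - y_i^{(j)} \theta_i^{(j)}\| = \beta_{j+1} |s_i^{(j)}(j)| =: \beta_{ji}.$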


This is a remarkable and rare situation; one can determine the $i$th residual norm, i.e.
$\|A y_i^{(j)} - y_i^{(j)} \theta_i^{(j)}\|$, without forming $y_i^{(j)}$. That is good news because $y_i^{(j)}$ will not be easy
to compute if all the Lanczos vectors have been discarded except for the last three. The
right side of (10) requires an algorithm to update Ritz values and last entries in certain
eigenvectors of a $j \times j$ tridiagonal matrix. When $j = 40$ and $n = 10^4$ the difference in
arithmetic effort is significant.
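
The residual formula above is easy to check numerically. The sketch below is an illustration under stated assumptions, not the updating algorithm the essay alludes to: it assumes a real symmetric dense $A$, uses a dense eigensolver on $T_j$ for simplicity, and the function names are hypothetical. It computes the residual norms from the last row of $S_j$ alone and, for comparison only, forms the Ritz vectors explicitly.

```python
import numpy as np

def lanczos(A, q1, m):
    """Simple Lanczos (no reorthogonalization) for a real symmetric dense A."""
    Q = np.zeros((q1.size, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m)   # beta[k] is the essay's beta_{k+2}
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for k in range(m):
        r = A @ Q[:, k]
        if k > 0:
            r = r - beta[k - 1] * Q[:, k - 1]
        alpha[k] = Q[:, k] @ r
        r = r - alpha[k] * Q[:, k]
        beta[k] = np.linalg.norm(r)
        if beta[k] == 0.0:
            raise ArithmeticError("invariant subspace found; stop the run here")
        Q[:, k + 1] = r / beta[k]
    return alpha, beta, Q

def ritz_values_and_residuals(alpha, beta):
    """Ritz values, eigenvectors of T_j, and residual norms beta_{j+1}|s_i(j)|.

    Uses only the j x j matrix T_j; no n-vector is touched.
    """
    j = alpha.size
    T = np.diag(alpha) + np.diag(beta[:j - 1], 1) + np.diag(beta[:j - 1], -1)
    theta, S = np.linalg.eigh(T)             # ascending Ritz values, orthonormal columns
    resid = beta[j - 1] * np.abs(S[j - 1, :])
    return theta, S, resid

# Hypothetical usage: n = 50, j = 10.
rng = np.random.default_rng(1)
B = rng.standard_normal((50, 50))
A = (B + B.T) / 2
alpha, beta, Q = lanczos(A, rng.standard_normal(50), 10)
theta, S, resid = ritz_values_and_residuals(alpha, beta)

# Comparison only: the explicit residuals need all stored Lanczos vectors.
Y = Q[:, :10] @ S
explicit = np.linalg.norm(A @ Y - Y * theta, axis=0)
```

The two arrays `resid` and `explicit` should agree to roughly round off level, while the first costs $O(j^3)$ here (less with a tridiagonal solver) and the second needs $O(nj^2)$ work plus storage for every Lanczos vector.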

Well known eigenvalue bounds, see [11] and [5], guarantee that there is at least one
eigenvalue $\lambda$ of $A$ that satisfies

$|\lambda - \theta_i^{(j)}| \le \beta_{ji}$

and, in fact,

$|\lambda - \theta_i^{(j)}| \le \beta_{ji}^2 / \mathrm{gap}(i),$

$\mathrm{gap}(i) = \min |\mu - \theta_i^{(j)}|,$

where $\mu$ is an eigenvalue of $A$ distinct from $\lambda$.

What this means in practice is that it is the quantities $|s_i^{(j)}(j)|$, $i = 1, \ldots, j$, which
measure the quality of $\theta_i^{(j)}$. One can judge $y_i^{(j)}$ before computing it. There are two ways
to compute $y_i^{(j)}$. Either save the old $q$-vectors $q_1, \ldots, q_{j-2}$ on secondary storage or run the
Lanczos algorithm for a second time accumulating $y_i^{(j)}$ alongside $q_k$, $k = 1, 2, \ldots, j$.

We now have two orthonormal bases for the Krylov subspace span$(q_1, A q_1, \ldots, A^{j-1} q_1)$:
the Lanczos basis $\{q_i\}$ and the Ritz basis $\{y_i^{(j)}\}$. Note that, in principle,
the whole Ritz basis changes at each step. The way in which the $\theta_i^{(j)}$ approach $A$'s
spectrum has received considerable attention, see [11] and [12]. The larger the ratio
$\min\{\lambda_{i+1} - \lambda_i, \lambda_i - \lambda_{i-1}\}/(\lambda_n - \lambda_1)$ the more rapidly does a Ritz value settle on to $\lambda_i$.
Even for a randomly chosen $q_1$ it is not uncommon for several extreme eigenvalues to be
approximated to 8 correct decimals after 30 or 40 steps, independent of $n$.

As  we  said  before:  It  all  seems  too  good  to  be  true. 

3  Finite  Precision  Arithmetic 

Digital  computers  discard  information  at  almost  every  arithmetic  operation.  When  only 
the  leading  50  or  60  bits  of  each  floating  point  number  are  retained  then  the  beautiful 
relationships expounded in Sections 1 and 2 fail. In order to describe what happens we
change notation slightly and let $Q_j$ and $T_j$ denote the quantities actually stored in the
computer. In contrast $S_j$ and $\Theta_j$ denote the exact spectral factors of the computed $T_j$.
The orthogonality equation (4) is replaced by

(11) $Q_j^* Q_j = C_j^* + I_j + C_j$

where $C_j$ is strictly upper triangular. Since the computed $q_i$ are not exactly normalized $I_j$
should be replaced by some diagonal matrix that is exceedingly close to $I_j$ but such veracity is
not cost effective. The three term recurrence (5) becomes

(12) $A Q_j - Q_j T_j = q_{j+1} \beta_{j+1} e_j + F_j$

where $f_i$, column $i$ of $F_j$, is just the amount by which the 3 term recurrence fails to hold
for the computed $q_i$. It turns out that $F_j$ remains at the round off level whereas $\|C_j\|$ grows
rapidly towards 1 as $j$ increases. We can think of (12) as (5) contaminated with 'white
noise' that remains at the round off level, i.e. $\|f_i\|$ is proportional to the round off unit and is
independent of $i$.
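
The contrast between $F_j$ and $C_j$ can be observed directly. The following sketch is an illustration only: the test matrix (a diagonal matrix with 99 eigenvalues spread over $[0, 1]$ and one isolated eigenvalue at 2), the starting vector of all ones, and the function names are assumptions made here, not the example behind the essay's figures. It runs in double precision rather than the round off of 1e-7 quoted in the figure captions.

```python
import numpy as np

def lanczos(A, q1, m):
    """Simple Lanczos (no reorthogonalization) for a real symmetric dense A."""
    Q = np.zeros((q1.size, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m)   # beta[k] is the essay's beta_{k+2}
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for k in range(m):
        r = A @ Q[:, k]
        if k > 0:
            r = r - beta[k - 1] * Q[:, k - 1]
        alpha[k] = Q[:, k] @ r
        r = r - alpha[k] * Q[:, k]
        beta[k] = np.linalg.norm(r)
        if beta[k] == 0.0:
            raise ArithmeticError("invariant subspace found; stop the run here")
        Q[:, k + 1] = r / beta[k]
    return alpha, beta, Q

def orthogonality_report(A, q1, m):
    """Return (max_i ||f_i||, ||C_m||_2) for the computed quantities of (11) and (12)."""
    alpha, beta, Q = lanczos(A, q1, m)
    Qm = Q[:, :m]
    off = beta[:m - 1]
    T = np.diag(alpha) + np.diag(off, 1) + np.diag(off, -1)
    F = A @ Qm - Qm @ T
    F[:, -1] -= beta[m - 1] * Q[:, m]        # remove the q_{m+1} beta_{m+1} e_m term
    C = np.triu(Qm.T @ Qm, 1)                # strictly upper triangular part of Q^* Q
    return np.linalg.norm(F, axis=0).max(), np.linalg.norm(C, 2)

# Assumed test problem: one well separated dominant eigenvalue.
lam = np.concatenate([np.linspace(0.0, 1.0, 99), [2.0]])
A = np.diag(lam)
q1 = np.ones(100)

f_short, c_short = orthogonality_report(A, q1, 8)
f_long, c_long = orthogonality_report(A, q1, 60)
print(f"m =  8: max ||f_i|| = {f_short:.1e}, ||C|| = {c_short:.1e}")
print(f"m = 60: max ||f_i|| = {f_long:.1e}, ||C|| = {c_long:.1e}")
```

On such a problem the recurrence residual stays near the unit round off at both run lengths, while the measure of orthogonality loss is negligible for the short run and no longer small for the long one.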

This  perturbation  to  (5)  has  significant  consequences.  The  algorithm  driven  by  (12) 
never terminates. When $j > n$ the columns of $Q_j$ cannot be linearly independent, let alone
orthonormal. In fact linear independence is lost long before $j$ is close to $n$. The next
question is: how does $T_j$ relate to $A$ when $j$ exceeds $n$? In exact arithmetic $T_n$ is similar to $A$ but that is
impossible for $T_j$ with $j > n$.

The  best  way  to  understand  this  orthogonality  loss  is  through  pictures.  The  figures 
that  follow  show  Cj  as  a  function  of  two  variables;  one  plots  | C(row,  col)  |  as  a  function  of 
(row,  col). 

Figure 1 shows a typical case; orthogonality leaks away. Close to the diagonal $C$ is
negligible but at step 30 $C(k, k + 15) = O(1)$ for most values of $k < 15$.

One  cannot  tell  from  Figure  1  whether  linear  independence  has  been  lost  yet  but 
orthogonality  loss  is  total. 

It  is  not  disrespectful  to  say  that  Lanczos  himself,  and  J.  H.  Wilkinson,  the  leading 
expert  in  matrix  computations  from  1960-1984,  both  panicked  at  this  phenomenon.  Each 
of them insisted on doing what we now call full reorthogonalization. With this modification,
at step $j$, $q_{j+1}$ is explicitly orthonormalized against all preceding Lanczos vectors. See [7],
p. 271, and [14], Section 38, line 3. This precaution increases the storage requirements
and  the  arithmetic  effort.  No  longer  is  the  cost  of  step  j  constant,  now  it  grows  linearly 
with  j.  Thus  full  reorthogonalization  seems  to  constrain  users  (except  for  rich  physicists) 
to  making  short  Lanczos  runs  whereas  Krylov  subspace  theory  reveals  the  approximating 
power  of  long  Lanczos  runs. 
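
For comparison, full reorthogonalization can be sketched as follows. This is one possible realization under stated assumptions, not the procedure of [7] or [14]: it uses a single classical Gram-Schmidt sweep per step, a real symmetric dense $A$, and the same assumed test matrix as an illustration (99 eigenvalues in $[0, 1]$ plus one at 2). The function name is hypothetical.

```python
import numpy as np

def lanczos_full_reorth(A, q1, m):
    """Lanczos with each new vector orthogonalized against all earlier ones.

    All Lanczos vectors must be kept, and the extra sweep at step k costs
    O(n k) operations, so the total extra work grows quadratically with m.
    """
    Q = np.zeros((q1.size, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m)   # beta[k] is the essay's beta_{k+2}
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for k in range(m):
        r = A @ Q[:, k]
        if k > 0:
            r = r - beta[k - 1] * Q[:, k - 1]
        alpha[k] = Q[:, k] @ r
        r = r - alpha[k] * Q[:, k]
        basis = Q[:, :k + 1]                 # q_1, ..., q_{k+1} in the essay's numbering
        r = r - basis @ (basis.T @ r)        # one Gram-Schmidt sweep (an assumption here)
        beta[k] = np.linalg.norm(r)
        if beta[k] == 0.0:
            raise ArithmeticError("invariant subspace found; stop the run here")
        Q[:, k + 1] = r / beta[k]
    return alpha, beta, Q

# Assumed test problem with one well separated dominant eigenvalue.
lam = np.concatenate([np.linspace(0.0, 1.0, 99), [2.0]])
A = np.diag(lam)
alpha, beta, Q = lanczos_full_reorth(A, np.ones(100), 60)

T = np.diag(alpha) + np.diag(beta[:59], 1) + np.diag(beta[:59], -1)
theta = np.linalg.eigvalsh(T)
orth_error = np.linalg.norm(Q.T @ Q - np.eye(61), 2)
copies_of_top = int(np.sum(np.abs(theta - 2.0) < 1e-6))
```

With the sweep in place the basis stays orthonormal to about working precision and the dominant eigenvalue is represented by a single Ritz value, at the price of storing every column of `Q`.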

The  technique  of  restarting  is  not  a  bad  response  to  the  difficulties  but  neither  is  it 
fully  satisfactory.  The  number  of  extra  applications  of  the  operator  beyond  those  needed 
by  one  long  Lanczos  run  can  be  significant  and  is  not  reported  by  those  developing  restart 
methods. 

4  Hidden  Structure 

Despite  appearances  the  loss  of  orthogonality  among  the  Lanczos  vectors  is  far  from  random. 
A  strong  idea  as  to  what  is  happening  is  given  by  changing  bases  from  Lanczos  to  Ritz.  The 
reader is invited to contemplate carefully Figures 2 through 5 which show steps 18-22 in a
typical Lanczos run. The top half shows $C_j$ and the bottom half shows the strictly upper
triangular part of $Y_j^* Y_j$, the Ritz picture. Step 18 is quite revealing. Orthogonality, judged
by human eyesight, has been maintained beautifully among the Ritz vectors EXCEPT that
$y_{18}^{(18)}$ is a copy of $y_{17}^{(18)}$, very nearly. That is not obvious from the Lanczos picture just
above. Thus range $Q_{18}$ had dimension 17.

Step 19 (Fig. 3) seems to tell the same story, except that now $y_{19}^{(19)}$ is a copy of $y_{18}^{(19)}$.
Here it is useful to remember that $y_j^{(j)}$ is the Ritz vector for the largest Ritz value;

$\theta_1^{(j)} < \theta_2^{(j)} < \cdots < \theta_j^{(j)}.$

Thus each of $y_{18}^{(19)}$ and $y_{19}^{(19)}$ is very close to the dominant eigenvector of $A$.

Step 20 (Fig. 4) spoils the simplicity of Step 19. Step 21 restores it (almost), but
space limitations forced us to omit that picture. Step 22 (Fig. 5) is very like Step 21; there
are 20 orthogonal Ritz vectors plus a spare copy of the two Ritz vectors that have converged
to the two dominant eigenvectors.

The idea of changing bases originated with C. C. Paige and was described in his Ph.D.
dissertation in 1970/71 (London University) and in [9] and [10]. Actually neither the
Lanczos basis alone nor the Ritz basis alone tells the story in its simplest terms. The key
idea is to look at the angle between $q_{j+1}$ and the previous Ritz vectors $(y_1^{(j)}, \ldots, y_j^{(j)})$. This
information is available at the end of step $j$.

The next set of pictures, Figures 6 to 10, reveal Paige's discovery. We repeat the previous
run; the picture plots $|y_i^{(j)*} q_{j+1}|$ as a function of $(i, j + 1)$. It is helpful to remember that
$q_{j+1}$ is being tested against a different basis than is $q_j$.

When this is understood one notices that, in Fig. 6, $q_{18}$ is bent towards $y_{17}^{(17)}$ and,
to a smaller extent, $q_{17}$ is bent towards $y_{16}^{(16)}$. Yet both these Ritz vectors are good
approximations to $A$'s dominant eigenvector. Thus $q_{17}$ and $q_{18}$ are pulled in the same
direction, $q_{18}$ more than $q_{17}$. Step 19 (Fig. 7) is a little disconcerting, at first, because $q_{19}$ has
maintained better orthogonality than did $q_{18}$. It is true; there is nothing monotonic in this
orthogonality loss. By Step 22 we can see that a further Ritz vector is on its way to becoming
a third copy of the dominant eigenvector!

There  is  a  beautiful  result  behind  the  foregoing  discussion. 

Theorem 4.1 (Paige's Theorem). If local orthogonality is maintained in the Lanczos
process, governed by (12), i.e.
if $q_{k+1}^* q_k = 0$ for all $k$, then for each $i \le j$,

(a) $y_i^{(j)*} q_{j+1} = \dfrac{\gamma_{ii}^{(j)}}{\beta_{ji}} = \dfrac{\gamma_{ii}^{(j)}}{\beta_{j+1} |s_i^{(j)}(j)|},$


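(b) for $k \le j$, $i \ne k$,

$(\theta_i^{(j)} - \theta_k^{(j)}) \, y_i^{(j)*} y_k^{(j)} = \gamma_{ii}^{(j)} \dfrac{s_k^{(j)}(j)}{s_i^{(j)}(j)} - \gamma_{kk}^{(j)} \dfrac{s_i^{(j)}(j)}{s_k^{(j)}(j)} - (\gamma_{ik}^{(j)} - \gamma_{ki}^{(j)}),$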


where

$\Gamma^{(j)} = (\gamma_{ik}^{(j)}) = S_j^* \, [\text{upper triangle}\,(Q_j^* F_j - F_j^* Q_j)] \, S_j$

and upper triangle $(M)$ denotes the strictly upper triangular part of $M$ with the rest of the
matrix filled with zero entries.

For  proofs  see  [9],  [10]  and  [11]. 

The striking feature of part (a) is that the denominator is precisely the formula for
$\|A y_i - y_i \theta_i\|$ in exact arithmetic.
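
Part (a) suggests a simple numerical experiment: the product of $|y_i^{(j)*} q_{j+1}|$ and the residual estimate $\beta_{j+1} |s_i^{(j)}(j)|$ should stay near round off level for every $i$ and $j$, even when the first factor has grown to a visible size. The sketch below is a hedged illustration of that statement only. The test matrix (99 eigenvalues in $[0, 1]$ plus one at 2), the starting vector, and the function names are assumptions introduced here, and signs are ignored by working with absolute values.

```python
import numpy as np

def lanczos(A, q1, m):
    """Simple Lanczos (no reorthogonalization) for a real symmetric dense A."""
    Q = np.zeros((q1.size, m + 1))
    alpha, beta = np.zeros(m), np.zeros(m)   # beta[k] is the essay's beta_{k+2}
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for k in range(m):
        r = A @ Q[:, k]
        if k > 0:
            r = r - beta[k - 1] * Q[:, k - 1]
        alpha[k] = Q[:, k] @ r
        r = r - alpha[k] * Q[:, k]
        beta[k] = np.linalg.norm(r)
        if beta[k] == 0.0:
            raise ArithmeticError("invariant subspace found; stop the run here")
        Q[:, k + 1] = r / beta[k]
    return alpha, beta, Q

def paige_quantities(A, q1, j):
    """Return |y_i^* q_{j+1}| and beta_{j+1}|s_i(j)| for i = 1..j after j steps."""
    alpha, beta, Q = lanczos(A, q1, j)
    T = np.diag(alpha) + np.diag(beta[:j - 1], 1) + np.diag(beta[:j - 1], -1)
    theta, S = np.linalg.eigh(T)
    Y = Q[:, :j] @ S                          # computed Ritz vectors
    overlap = np.abs(Y.T @ Q[:, j])           # |y_i^* q_{j+1}|
    beta_ji = beta[j - 1] * np.abs(S[j - 1, :])
    return overlap, beta_ji

# Assumed test problem with one well separated dominant eigenvalue.
lam = np.concatenate([np.linspace(0.0, 1.0, 99), [2.0]])
A = np.diag(lam)
q1 = np.ones(100)

largest_overlap = 0.0
largest_product = 0.0
for j in range(5, 41):
    overlap, beta_ji = paige_quantities(A, q1, j)
    largest_overlap = max(largest_overlap, overlap.max())
    largest_product = max(largest_product, (overlap * beta_ji).max())
print(f"largest |y* q|         : {largest_overlap:.1e}")
print(f"largest |y* q| * beta_ji: {largest_product:.1e}")
```

Rerunning the recurrence for each $j$ is wasteful but keeps the sketch short; the point is only that orthogonality is lost in the direction of Ritz vectors whose residual estimates have become tiny.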


5  The  Lanczos  Phenomenon 

Paige's work stimulated a variety of implementations of the Lanczos algorithm which differ
in the extent to which orthogonality of the $\{q_i\}$ is maintained. However that is not the
subject of this essay.

Experience with the simple, minimal effort, algorithm revealed that Ritz values cluster
very closely round the eigenvalues of $A$ as the number of steps increases. There will be
perhaps hundreds of Ritz values within $O(\epsilon \|A\|)$ of the first eigenvalue to be found before the
last one has a single Ritz value beside it. Nevertheless all eigenvalues are found eventually,
thanks to round off, even when the starting vector is orthogonal to an eigenvector of $A$.
The only known exception is the artificial case when $A$ is diagonal and the starting vector
has a zero entry.

At any particular value of $j$ there may be Ritz values that are not eigenvalues of
$A$ but the corresponding $\beta_{ji}$ values will not be small and, sooner or later, that Ritz
value $\theta_i^{(j)}$ will move. The way in which $A$'s spectrum is revealed by Ritz value behavior
is a complicated function of the starting vector, its relation to $A$'s eigenvectors, and the
distribution of $A$'s spectrum.
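
The clustering of Ritz values can be seen by letting the simple algorithm run past $n$ steps. The sketch below keeps only the two most recent Lanczos vectors plus a work vector, in the spirit of the minimal effort algorithm. It is an illustration under assumptions made here: a real symmetric dense test matrix with 99 eigenvalues in $[0, 1]$ and one at 2, a starting vector of all ones, double precision, and a hypothetical function name.

```python
import numpy as np

def lanczos_coefficients(A, q1, m):
    """Simple Lanczos keeping only q_{k-1}, q_k and a work vector.

    Returns alpha and beta, where beta[k] is the essay's beta_{k+2}.
    """
    q_prev = np.zeros(q1.size)
    q = q1 / np.linalg.norm(q1)
    alpha, beta = np.zeros(m), np.zeros(m)
    b = 0.0
    for k in range(m):
        r = A @ q - b * q_prev
        alpha[k] = q @ r
        r = r - alpha[k] * q
        b = np.linalg.norm(r)
        if b == 0.0:
            raise ArithmeticError("invariant subspace found; stop the run here")
        beta[k] = b
        q_prev, q = q, r / b                 # the older vector is discarded here
    return alpha, beta

# Assumed test problem: n = 100, run for m = 150 > n steps.
n, m = 100, 150
lam = np.concatenate([np.linspace(0.0, 1.0, n - 1), [2.0]])
A = np.diag(lam)
alpha, beta = lanczos_coefficients(A, np.ones(n), m)

T = np.diag(alpha) + np.diag(beta[:m - 1], 1) + np.diag(beta[:m - 1], -1)
theta = np.linalg.eigvalsh(T)                # m Ritz values, more than n of them
copies_of_top = int(np.sum(np.abs(theta - lam[-1]) < 1e-8))
print("Ritz values within 1e-8 of the dominant eigenvalue:", copies_of_top)
```

On this problem the dominant eigenvalue is expected to acquire several Ritz values in its immediate neighbourhood well before all interior eigenvalues have been located.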

Anne Greenbaum, in [3], produced a backward error analysis which showed that for a
given number of steps $J$ and a computed tridiagonal $T_J$ there is a matrix $A'$ and a starting
vector $q_1'$ such that the exact Lanczos algorithm applied to $A'$, $q_1'$ will produce $T_J$. This
work, and further extensions of it, are described in this volume in the section written by
Greenbaum.

In [2] came the first proofs of the Lanczos phenomenon: all eigenvalues are captured
eventually. The results are full of technical details and we state them informally here.

Theorem 5.1 (Druskin and Knizhnerman). Let $A z_r = z_r \lambda_r$, $\|z_r\| = 1$. If
$\|A\| < 0.9$, if the round off unit is small enough compared to $j$, if $\phi_r := q_1^* z_r \ne 0$, and
if $\ln(17 j/\phi_r^2) < j - 2$, then for some $i \le j$


^•)  _  ^  <  ln(17 


oo 


The important point here is that, in exact arithmetic, the analogous error bound is $O(1/j)$
for $j < n$. The results reflect worst case situations.

A much more realistic bound is derived in terms of the gap $\gamma_r$ separating $\lambda_r$ from the
rest of the spectrum. The bound is too complicated to give insight of itself but it may be
used to generate useful diagrams. It seems pointless to copy the result in detail. It says
that under a lot of apparently reasonable conditions on $\epsilon$ there is an index $i$ such that

$|\theta_i^{(j)} - \lambda_r| < \kappa$

for any $\kappa$ that satisfies an inequality of the form


4>rK 


^  l<M\/jei  +  2(1  +  Vjt\)  • 


2j<3 


+  2^6,  +  (1  +  ■/] )/Tm  ( 


1-7? 


where $T_j$ denotes the Chebyshev polynomial of the first kind of degree $j$. Here $\epsilon_1$ and $\epsilon_3$ are
functions of $\epsilon$. Note the presence of $\kappa$ on both sides of the inequality. For a small enough
round off unit the bound reduces to


\8\j)  -  Arl  < 


m  ±  vi) 

I) 


which  reminds  specialists  of  the  Kaniel-Paige-Saad  bounds  in  exact  arithmetic,  see  [12]. 

Knizhnerman [6] has further recent results (private communication) which show that the
Ritz values cluster in intervals of radius $O(\epsilon \|A\|)$ about the eigenvalues. This is a significant
improvement on earlier results of the form $O(\sqrt{\epsilon} \|A\|)$.

Fig. 2. n = 100, ratio = 0.87, rel gap = 0.13, round off = 1e-7, step 18

Fig. 3. n = 100, ratio = 0.87, rel gap = 0.13, round off = 1e-7, step 19

Fig. 4. n = 100, ratio = 0.87, rel gap = 0.13, round off = 1e-7, step 20

Fig. 5. n = 100, ratio = 0.87, rel gap = 0.13, round off = 1e-7, step 22

Fig. 7. n = 100, ratio = 0.87, rel gap = 0.13, round off = 1e-7, step 19

Fig. 8. n = 100, ratio = 0.87, rel gap = 0.13, round off = 1e-7, step 20

Fig. 9. n = 100, ratio = 0.87, rel gap = 0.13, round off = 1e-7, step 22


